PROJECTIVE DISTANCE AND g-MEASURES


L. TREJO-VALENCIA AND E. UGALDE 


Abstract. We introduce a distance in the space of fully-supported probability
measures on one-dimensional symbolic spaces. We compare this distance to
the d-distance and we prove that, in general, they are not comparable. Our
projective distance is inspired by Hilbert's projective metric and, in the
framework of g-measures, it allows us to assess the continuity of the entropy
at g-measures satisfying uniqueness. It also allows us to relate the speed of
convergence and the regularity of sequences of locally finite g-functions to
the preservation, at the limit, of certain ergodic properties of the associated
g-measures.


1. Introduction. 

1.1. In [?] Hilbert introduced the so-called projective distance, for which the
geodesics are precisely the straight lines. It was later used by G. Birkhoff to
prove the existence and uniqueness of positive eigenvectors for positive linear
transformations on Banach spaces [1]. Birkhoff's strategy goes as follows: a
uniformly positive bounded linear transformation maps the positive cone of a
Banach space into itself. This transformation is non-expansive with respect to
the projective distance, and if the image cone has finite diameter, then the
transformation is a projective contraction. In this case Banach's fixed point
theorem ensures the existence and uniqueness of a projective fixed point for the
linear transformation, and projective fixed points are nothing but positive
eigenvectors. Furthermore, contractiveness ensures that the iterates of the
linear transformation on any positive vector converge exponentially fast, in the
projective sense, towards the fixed point. Birkhoff's strategy has been
successfully employed in the solution of a variety of problems, in particular to
prove existence and uniqueness of invariant measures and the exponential decay
of correlations of suitable observables. This has been done for symbolic systems
[10, 23], for suitable one-dimensional maps [?], and for general maps with some
degree of hyperbolicity [?].

Ornstein's d-distance was introduced in [52] to give a topological
characterization of the Bernoulli processes. This distance generates a
topological structure well adapted to the study of important ergodic properties.
For instance, d-limits of sequences of mixing processes are mixing, and the class
of Bernoulli processes is d-closed, as is the class of K-processes. Bressaud and
coauthors, in a study of Markov approximations to g-measures (chains of complete
connection in their nomenclature), found an upper bound for the speed of
d-convergence of the approximations in terms of the regularity of the g-function
[3]. In a related work [7], Coelho and Quas studied the d-continuity of
g-measures with respect to the uniform distance between g-functions.

1.2. In [5] we established a relation between the rate of projective convergence
of the Markovian approximations of a one-dimensional Gibbs measure and the decay
of correlations of the limiting Gibbs measure. The result extends
straightforwardly to the case of g-measures defined by sufficiently regular
g-functions. Our technique relies on a projective comparison of the marginals of
the approximating measures. If the potential defining the Gibbs measure is
sufficiently regular, then the finite range approximations are sufficiently
similar "in the projective sense", and in this case the mixing rate of the Gibbs
measure can be bounded from above by a function of the mixing rates of the
approximations. Additionally, in this fast approximation regime, the entropy of
the approximations converges to the entropy of the Gibbs measure. Furthermore,
since in that case the relative entropy of the limiting Gibbs measure with
respect to the approximations goes to zero, Marton's bounds [?] ensure the
convergence of the approximations in d-distance. In a recent work [20],
Maldonado and Salgado applied our approach to study the approximability of Gibbs
measures for two-body interactions in one-dimensional symbolic systems. This
technique was also used in our study of the preservation of Gibbsianness under
amalgamation of symbols [6].

1.3. Despite its actual and potential applications, our notion of "projective
convergence" has not yet been formalized, nor has its relation to d-convergence
or vague convergence been established. The aim of this paper is to fill this gap
and to explore to what extent projective convergence, as we define it, is well
adapted to the study of particular classes of processes. We consider in
particular the class of g-measures, leaving for a forthcoming work the study of
measures obtained by random substitutions, for which we already have some
preliminary results. The rest of the paper is organized as follows. The next
section is devoted to the study of some general properties of the projective
distance, particularly its relation to the vague distance and the d-distance. In
Section 3 we study the convergence of Markov approximations to a g-measure and
the continuity of the entropy at g-measures satisfying uniqueness, and we
establish a criterion for uniqueness based on the speed of convergence and
regularity of Markov approximations. Section 4 contains some concluding remarks
and perspectives.

We thank Laboratorio Internacional Solomon Lefschetz for financing our academic
exchange with Professor Chazottes from Ecole Polytechnique. L. Trejo-Valencia is
supported by CONACyT through the Ph.D. Fellowship 332432.

2. Projective Distance 

2.1. Let $A$ be a finite set, which we also call the alphabet, and let
$X := A^{\mathbb N}$ be the set of infinite $A$-valued sequences. As usual, the
elements of $A$ will be called symbols, and the finite tuples in $A$ words. Given
$x = x_1x_2\cdots \in A^{\mathbb N}$ and natural numbers $1 \le n \le m$, $x_n^m$
denotes the word $x_n x_{n+1}\cdots x_{m-1}x_m$. The left shift
$T : A^{\mathbb N} \to A^{\mathbb N}$ is such that $(Tx)_i = x_{i+1}$ for all
$i \in \mathbb N$. The pair $(X,T)$ is the full shift on the alphabet $A$.

To a word $a \in A^n$, $n \in \mathbb N$, we associate the cylinder set
$[a] := \{x \in A^{\mathbb N} : x_1^n = a\}$. Cylinder sets are clopen in the
standard Tychonoff topology and generate the corresponding Borel
$\sigma$-algebra $\mathcal B(X)$. We denote by $\mathcal M(X)$ the set of all
Borel probability measures on $X$ and by $\mathcal M_T(X)$ the subset of
$T$-invariant probability measures. Both $\mathcal M(X)$ and $\mathcal M_T(X)$
are compact convex sets in the vague topology. The vague topology can be
metrized by the distance

\[ D(\mu,\nu) := \sum_{n\in\mathbb N} 2^{-n} \Big( \sum_{a\in A^n} \big|\mu[a]-\nu[a]\big| \Big). \tag{1} \]

It is known that $\mathcal M(X)$ as well as $\mathcal M_T(X)$ are convex sets,
complete and separable in the vague topology. Furthermore, they have the
structure of a simplex, which, in the case of $\mathcal M_T(X)$, implies the
uniqueness of the ergodic decomposition [8].

Given $\mu, \nu \in \mathcal M(X)$, a coupling between $\mu$ and $\nu$ is a
measure $\lambda \in \mathcal M((A\times A)^{\mathbb N})$ such that for all
$n \in \mathbb N$,

\[ \sum_{b\in A^n} \lambda[a\times b] = \mu[a], \qquad \sum_{a\in A^n} \lambda[a\times b] = \nu[b]. \]

Here $a\times b = (a_1b_1)(a_2b_2)\cdots(a_nb_n) \in (A\times A)^n$ for each
$a, b \in A^n$. With $J(\mu,\nu) \subset \mathcal M((A\times A)^{\mathbb N})$ we
denote the set of all couplings between $\mu$ and $\nu$. Ornstein's d-distance
is given by

\[ d(\mu,\nu) = \inf_{\lambda\in J(\mu,\nu)} \limsup_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \lambda(T^{-k}\Delta), \tag{2} \]

where $\Delta = \{ab \in A\times A : a \neq b\}$ is the complement of the
diagonal. The distance $d$ makes $\mathcal M(X)$ a complete but non-separable
topological space. The same holds when $d$ is restricted to the subspace of
$T$-invariant measures $\mathcal M_T(X)$ (see [22] for instance).


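2.2. Let $\mathcal M^+(X) \subset \mathcal M(X)$ be the set of fully-supported
Borel probability measures on $X$, i.e., $\mu \in \mathcal M^+(X)$ if and only
if $\mu[a] > 0$ for all $a \in \bigcup_{n\in\mathbb N} A^n$. We define
$\rho : \mathcal M^+(X) \times \mathcal M^+(X) \to \mathbb R^+$ by

\[ \rho(\mu,\nu) = \sup_{n\in\mathbb N} \max_{a\in A^n} \frac{1}{n} \left| \log\frac{\mu[a]}{\nu[a]} \right|. \tag{3} \]

The function $\rho$ defines a distance on $\mathcal M^+(X)$ which we call the
projective distance.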
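
As an illustration of Equation (3), the following minimal Python sketch evaluates the supremum only over word lengths up to a cutoff, which in general gives a lower bound for the projective distance. The helper names `iid_cylinder` and `truncated_projective_distance`, the cutoff, and the two product (Bernoulli) measures are assumptions made for this example and do not come from the paper.

```python
import itertools
import math

def iid_cylinder(p):
    """Return the map word -> mu[word] for the product measure with marginal p."""
    def mu(word):
        prob = 1.0
        for symbol in word:
            prob *= p[symbol]
        return prob
    return mu

def truncated_projective_distance(mu, nu, alphabet, max_len):
    """Max over n <= max_len and words a of length n of |log(mu[a]/nu[a])| / n."""
    best = 0.0
    for n in range(1, max_len + 1):
        for word in itertools.product(alphabet, repeat=n):
            best = max(best, abs(math.log(mu(word) / nu(word))) / n)
    return best

# Hypothetical example: two Bernoulli measures on the alphabet {0, 1}.
mu = iid_cylinder({0: 0.3, 1: 0.7})
nu = iid_cylinder({0: 0.5, 1: 0.5})
rho_6 = truncated_projective_distance(mu, nu, (0, 1), 6)
```

For product measures the logarithm of the ratio is a sum of one-symbol terms, so the truncated value already agrees with the largest one-symbol log-ratio; for general measures the cutoff only yields a lower estimate.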


Theorem 1. $\mathcal M^+(X)$ is a complete metric space with respect to $\rho$.


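Proof. Let us first verify that $\rho$ defines a metric. Clearly
$\rho(\mu,\nu) \ge 0$ for all $\mu, \nu \in \mathcal M^+(X)$, and
$\rho(\mu,\nu) = 0$ if and only if $\mu[a] = \nu[a]$ for all $n \in \mathbb N$
and $a \in A^n$, which readily implies $\mu = \nu$. Now, since for all
$n \in \mathbb N$, $a \in A^n$ and each $\lambda \in \mathcal M^+(X)$ we have

\[ \left|\log\frac{\mu[a]}{\nu[a]}\right| = \left|\log\frac{\mu[a]\,\lambda[a]}{\nu[a]\,\lambda[a]}\right| = \left|\log\frac{\mu[a]}{\lambda[a]} + \log\frac{\lambda[a]}{\nu[a]}\right| \le \left|\log\frac{\mu[a]}{\lambda[a]}\right| + \left|\log\frac{\lambda[a]}{\nu[a]}\right|, \]

then $\rho(\mu,\nu) \le \rho(\mu,\lambda) + \rho(\lambda,\nu)$ for all
$\mu, \lambda, \nu \in \mathcal M^+(X)$.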



Let $\mu, \nu \in \mathcal M^+(X)$ be such that $\rho(\mu,\nu) < \log 2$. Then
for all $n \in \mathbb N$ and $a \in A^n$ we have
$e^{-n\rho(\mu,\nu)}\nu[a] \le \mu[a] \le e^{n\rho(\mu,\nu)}\nu[a]$, which
implies $|\mu[a]-\nu[a]| \le (e^{n\rho(\mu,\nu)}-1)\,\nu[a]$, and from this

\[ D(\mu,\nu) \le \sum_{n\in\mathbb N} 2^{-n}\big(e^{n\rho(\mu,\nu)}-1\big) = \frac{2\,(e^{\rho(\mu,\nu)}-1)}{2-e^{\rho(\mu,\nu)}}, \tag{4} \]

which tends to zero as $\rho(\mu,\nu) \to 0$. With this we prove that the vague
topology is weaker than the one induced by $\rho$.

Let us now prove that $\mathcal M^+(X)$ is complete with respect to the distance
$\rho$. For this let $\{\mu_m\}_{m\in\mathbb N}$ be a Cauchy sequence with
respect to $\rho$, which is a Cauchy sequence with respect to $D$ as well. Since
$D$ makes $\mathcal M(X)$ a complete space, there exists
$\mu \in \mathcal M(X)$ towards which $\{\mu_m\}_{m\in\mathbb N}$ converges.
Now, for each $n \in \mathbb N$, $a \in A^n$ and every $m \in \mathbb N$, we
have $e^{-n\rho(\mu_m,\mu_1)}\mu_1[a] \le \mu_m[a]$, therefore

\[ \mu[a] = \lim_{m\to\infty}\mu_m[a] \ge \mu_1[a]\, e^{-n\sup_{m\ge 1}\rho(\mu_m,\mu_1)} > 0, \]

which proves that $\mu \in \mathcal M^+(X)$. Finally, since
$\mu[a] = \lim_{m\to\infty}\mu_m[a]$, we have

\[ e^{-n\sup_{m\ge m_0}\rho(\mu_m,\mu_{m_0})} \le \frac{\mu[a]}{\mu_{m_0}[a]} \le e^{n\sup_{m\ge m_0}\rho(\mu_m,\mu_{m_0})} \]

for each $n \in \mathbb N$, $a \in A^n$ and $m_0 \in \mathbb N$. From this it
follows that

\[ \rho(\mu,\mu_{m_0}) \le \sup_{m\ge m_0}\rho(\mu_m,\mu_{m_0}), \]

which proves that $\mu$ is the limit of $\{\mu_m\}_{m\in\mathbb N}$ in the
projective distance. □
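
The geometric series appearing in Inequality (4) can be summed in closed form when the projective distance is smaller than log 2. The Python sketch below compares a partial sum with that closed form; the function names and the number of terms are choices made for this illustration only.

```python
import math

def vague_bound_series(rho, terms=200):
    """Partial sum of sum_{n>=1} 2**(-n) * (exp(n*rho) - 1)."""
    return sum(2.0 ** (-n) * (math.exp(n * rho) - 1.0) for n in range(1, terms + 1))

def vague_bound_closed_form(rho):
    """Closed form 2*(e**rho - 1) / (2 - e**rho), valid for 0 <= rho < log 2."""
    if not 0.0 <= rho < math.log(2.0):
        raise ValueError("the series converges only for 0 <= rho < log 2")
    return 2.0 * (math.exp(rho) - 1.0) / (2.0 - math.exp(rho))

series_value = vague_bound_series(0.1)
closed_value = vague_bound_closed_form(0.1)
```

The closed form vanishes at zero and blows up as the argument approaches log 2, which is why the comparison with the vague distance is only stated for small projective distances.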


As mentioned above, $\mathcal M(X)$ is separable in the vague topology while it
is non-separable with respect to the topology induced by $d$. In this respect,
regarding the projective distance we have the following.

Theorem 2. $\mathcal M^+(X)$ is non-separable with respect to $\rho$.

Proof. We will exhibit a collection
$\{\mu_x \in \mathcal M^+(X) : x \in \{0,1\}^{\mathbb N}\}$ such that
$\rho(\mu_x,\mu_y) \ge 1/2$ whenever $x \neq y$.

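Fix $x \in \{0,1\}^{\mathbb N}$, and for each $n \in \mathbb N$ and
$a \in \{0,1\}^n$ let

\[ q(a) = \max\{1 \le k \le n : a_1^k = x_1^k\} + 1, \]

with the convention that the maximum of the empty set is 0. Now, fix
$\alpha > 1$ and let $\nu_x \in \mathcal M^+(\{0,1\}^{\mathbb N})$ be given by

\[ \nu_x[a] = \begin{cases} \alpha^{n}(1+\alpha)^{-n} & \text{if } a = x_1^n, \\ \alpha^{q(a)-1}(1+\alpha)^{-q(a)}\,2^{q(a)-n} & \text{if } a \neq x_1^n, \end{cases} \tag{5} \]

for all $n \in \mathbb N$ and $a \in \{0,1\}^n$.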

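Let us check that $\nu_x$ is well defined. For this notice that

\begin{align*}
\sum_{a\in\{0,1\}^n} \nu_x[a] &= \nu_x[x_1^n] + \sum_{a\in\{0,1\}^n\setminus\{x_1^n\}} \nu_x[a] \\
&= \Big(\frac{\alpha}{1+\alpha}\Big)^{n} + \frac{1}{1+\alpha}\sum_{q=1}^{n}\Big(\frac{\alpha}{1+\alpha}\Big)^{q-1}\frac{\#\{a\in\{0,1\}^n : q(a) = q\}}{2^{n-q}} \\
&= \Big(\frac{\alpha}{1+\alpha}\Big)^{n} + \frac{1}{1+\alpha}\,\frac{1-(\alpha/(1+\alpha))^{n}}{1-\alpha/(1+\alpha)} = 1,
\end{align*}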
which proves that the marginals are well normalized. Now, if
$a \in \{0,1\}^n$ is such that $q(a) \le n$, then $q(ab) = q(a)$ for all
$b \in \{0,1\}$, and

\[ \sum_{b\in\{0,1\}} \nu_x[ab] = 2\,\frac{\alpha^{q(a)-1}}{(1+\alpha)^{q(a)}\,2^{n+1-q(a)}} = \nu_x[a]. \]

Otherwise, if $a = x_1^n$, then

\begin{align*}
\sum_{b\in\{0,1\}} \nu_x[ab] &= \nu_x[x_1^{n+1}] + \sum_{b\in\{0,1\}\setminus\{x_{n+1}\}} \nu_x[x_1^n b] \\
&= \Big(\frac{\alpha}{1+\alpha}\Big)^{n+1} + \frac{\alpha^{n}}{(1+\alpha)^{n+1}} = \Big(\frac{\alpha}{1+\alpha}\Big)^{n} = \nu_x[a].
\end{align*}

We have proven that the marginals are well normalized and compatible, which
ensures that $\nu_x$ is well defined.

For $y \neq x$ let $m = \min\{k \in \mathbb N : y_k \neq x_k\}$. Then we have

\begin{align*}
\rho(\nu_x,\nu_y) &\ge \limsup_{n\to\infty} \frac{1}{n}\left|\log\frac{\nu_x[x_1^n]}{\nu_y[x_1^n]}\right| \\
&= \lim_{n\to\infty} \frac{1}{n}\log\frac{\alpha^{n}(1+\alpha)^{-n}}{\alpha^{m-1}(1+\alpha)^{-m}\,2^{m-n}} = \log\frac{2\alpha}{1+\alpha}.
\end{align*}

By taking $\alpha = e^{1/2}/(2-e^{1/2})$ we obtain
$\rho(\nu_x,\nu_y) \ge 1/2$ for all $x \neq y$.

Now, consider any surjective map $\pi : A \to \{0,1\}$ and for each
$n \in \mathbb N$ extend it coordinatewise to $A^n$. We will denote all those
coordinatewise extensions with the same letter $\pi$. For each
$x \in \{0,1\}^{\mathbb N}$ the measure $\mu_x \in \mathcal M^+(X)$ is given by

\[ \mu_x[a] = \frac{\nu_x[\pi(a)]}{\#\pi^{-1}(\pi(a))}. \tag{6} \]

This measure is well defined since for each $n \in \mathbb N$

\[ \sum_{a\in A^n} \mu_x[a] = \sum_{b\in\{0,1\}^n} \#\pi^{-1}(b)\,\frac{\nu_x[b]}{\#\pi^{-1}(b)} = 1, \]

and for each $a \in A^n$

\begin{align*}
\sum_{a'\in A} \mu_x[aa'] &= \sum_{a'\in A} \frac{\nu_x[\pi(a)\pi(a')]}{\#\pi^{-1}(\pi(a)\pi(a'))} \\
&= \sum_{b\in\{0,1\}} \#\pi^{-1}(b)\,\frac{\nu_x[\pi(a)b]}{\#\pi^{-1}(\pi(a))\,\#\pi^{-1}(b)} = \mu_x[a].
\end{align*}


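Now, for $x \neq y$ we have

\begin{align*}
\rho(\mu_x,\mu_y) &= \sup_{n\in\mathbb N} \frac{1}{n}\max_{a\in A^n}\left|\log\frac{\mu_x[a]}{\mu_y[a]}\right| = \sup_{n\in\mathbb N} \frac{1}{n}\max_{a\in A^n}\left|\log\frac{\nu_x[\pi(a)]}{\nu_y[\pi(a)]}\right| \\
&= \sup_{n\in\mathbb N} \frac{1}{n}\max_{b\in\{0,1\}^n}\left|\log\frac{\nu_x[b]}{\nu_y[b]}\right| = \rho(\nu_x,\nu_y) \ge 1/2.
\end{align*}

In this way we obtain the desired uncountable collection
$\{\mu_x \in \mathcal M^+(X) : x \in \{0,1\}^{\mathbb N}\}$ such that
$\rho(\mu_x,\mu_y) \ge 1/2$ whenever $x \neq y$. □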
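
The measures of Equation (5) are explicit enough to be checked numerically. The following Python sketch, written under the assumption that a finite prefix of the reference sequence is long enough for the words considered, evaluates the cylinder weights, their total mass, and the log-ratio rate bounded from below in the proof. All function names are hypothetical helpers introduced for this example.

```python
import itertools
import math

ALPHA = math.exp(0.5) / (2.0 - math.exp(0.5))  # the value of alpha chosen in the proof

def nu(word, x, alpha=ALPHA):
    """Weight nu_x[word] of Equation (5); x is a finite prefix of the reference sequence."""
    n = len(word)
    matched = 0
    while matched < n and word[matched] == x[matched]:
        matched += 1
    if matched == n:                       # word coincides with x_1^n
        return (alpha / (1.0 + alpha)) ** n
    q = matched + 1                        # q(word) <= n
    return alpha ** (q - 1) * (1.0 + alpha) ** (-q) * 2.0 ** (q - n)

def total_mass(n, x):
    """Sum of nu_x over all binary words of length n."""
    return sum(nu(w, x) for w in itertools.product((0, 1), repeat=n))

def log_ratio_rate(n, x, y):
    """(1/n) * log(nu_x[x_1^n] / nu_y[x_1^n])."""
    word = tuple(x[:n])
    return math.log(nu(word, x) / nu(word, y)) / n

# Hypothetical reference sequences differing at the first coordinate.
x = (0,) * 60
y = (1,) + (0,) * 59
mass_8 = total_mass(8, x)
rate_50 = log_ratio_rate(50, x, y)
```

With this choice of alpha the rate approaches 1/2 from above as the word length grows, in agreement with the lower bound used to separate the measures.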

2.3. According to Equation (4), the vague topology is coarser than the
projective topology (the one induced by $\rho$). It is well known, and easy to
argue, that the d-topology is finer than the vague topology, and it remains to
determine how the projective topology is placed with respect to the d-topology.
Below we will prove that $\rho$ is not coarser than $d$. With this, and a
construction based on g-measures which we will present in Section 3, we will be
able to complete the proof that $\rho$ and $d$ are not comparable.

Theorem 3. There exists a sequence
$\{\mu_p \in \mathcal M^+(X)\}_{p\in\mathbb N}$ converging in d-distance, but not
in the projective distance.


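Proof. Let $\mu_x \in \mathcal M^+(X)$ be as in the proof of Theorem 2. We will
exhibit a sequence $\{x_p \in \{0,1\}^{\mathbb N}\}_{p\in\mathbb N}$ such that
$\{\mu_{x_p}\}_{p\in\mathbb N}$ converges with respect to $d$.

Fix $x \in \{0,1\}^{\mathbb N}$ and for each $p \in \mathbb N$ let
$x_p \in \{0,1\}^{\mathbb N}$ be such that

\[ (x_p)_k = \begin{cases} 1 - x_k & \text{if } k \in p\mathbb N + 1, \\ x_k & \text{if } k \notin p\mathbb N + 1. \end{cases} \]

Consider the measures $\mu_{x_p}$ and $\mu_x$ as defined in Equation (6). Let us
recall that for each $y \in \{0,1\}^{\mathbb N}$, the measure
$\mu_y \in \mathcal M(X)$ is induced by a corresponding measure
$\nu_y \in \mathcal M(\{0,1\}^{\mathbb N})$, defined in Equation (5), via a
projection $\pi : A \to \{0,1\}$. Let $\tau : A \to A$ be a permutation
satisfying $\tau(a) \in \pi^{-1}(1-\pi(a))$ for each $a \in A$ and with this, for
each $n \in \mathbb N$, define the permutation $\tau_p : A^n \to A^n$ such that

\[ (\tau_p(a))_k = \begin{cases} \tau(a_k) & \text{if } k \in p\mathbb N + 1, \\ a_k & \text{if } k \notin p\mathbb N + 1. \end{cases} \]

We will denote all those permutations with the same symbol $\tau_p$. With this
we define the coupling $\lambda_p \in J(\mu_x,\mu_{x_p})$ such that for each
$a \times b \in (A\times A)^n$

\[ \lambda_p[a\times b] = \begin{cases} \mu_x[a] & \text{if } b = \tau_p(a), \\ 0 & \text{otherwise.} \end{cases} \]

The permutation $\tau_p$ is designed so that
$|a_k - x_k| = |\tau_p(a)_k - (x_p)_k|$ for all $1 \le k \le n$ (identifying each
symbol with its image under $\pi$). This ensures that
$\mu_x[a] = \mu_{x_p}[\tau_p(a)]$, from which it follows that $\lambda_p$ is a
coupling.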
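By using this coupling we obtain

\begin{align*}
d(\mu_x,\mu_{x_p}) &\le \limsup_{n\to\infty} \frac{1}{n}\sum_{k=1}^{n} \lambda_p(T^{-k}\Delta) \\
&= \limsup_{n\to\infty} \frac{1}{n}\sum_{k=1}^{n} \lambda_p\{a\times b \in (A\times A)^n : a_k \neq b_k\} \\
&= \limsup_{n\to\infty} \frac{\#(\{1,2,\ldots,n\}\cap(p\mathbb N+1))}{n} = \frac{1}{p}.
\end{align*}

In this way we have proved that $\mu_x = \lim_{p\to\infty}\mu_{x_p}$ in
d-distance.

Theorem 2 ensures that $\rho(\mu_{x_p},\mu_{x_{p'}}) \ge 1/2$ for all
$p \neq p'$. The theorem follows by taking $\mu_p := \mu_{x_p}$. □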
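
The last step of the estimate above is a density count: the coupling only disagrees at the coordinates that were flipped. A small Python sketch of that count follows, assuming the set of flipped positions is read as the positive integers congruent to 1 modulo p; whether the first coordinate is included does not affect the limiting frequency. The function names are illustrative.

```python
def flipped_positions(p, n):
    """Positions k in {1, ..., n} with k = 1 (mod p); assumed reading of pN + 1."""
    return [k for k in range(1, n + 1) if k % p == 1 % p]

def mismatch_frequency(p, n):
    """Fraction of the first n coordinates at which x and x_p differ."""
    return len(flipped_positions(p, n)) / n

freq_4 = mismatch_frequency(4, 4000)
freq_7 = mismatch_frequency(7, 100000)
```

The frequency approaches 1/p, so the d-distance to the limit goes to zero even though any two of the modified measures stay at projective distance at least 1/2.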
3. g-Measures

3.1. Let us start with a brief reminder of g-measures. A g-function is any Borel
measurable function $g : X \to (0,1)$ satisfying $\sum_{a\in A} g(ax) = 1$ for
all $x \in X$, and a compatible g-measure is any
$\mu \in \mathcal M_T^+(X) := \mathcal M^+(X) \cap \mathcal M_T(X)$ satisfying

\[ \lim_{n\to\infty} \mu(x_1 = a_1 \mid x_2^n = a_2^n) := \lim_{n\to\infty} \frac{\mu[a_1^n]}{\mu[a_2^n]} = g(a), \tag{7} \]

for all $a \in X$. This notion is intended to generalize that of a Markov chain
and was introduced into ergodic theory by M. Keane in [13]. It has as an
ancestor the so-called chains with complete connections, studied in probability
theory as early as 1935 [25]. This notion is related, and under some conditions
equivalent, to the notion of equilibrium state [30, 15]. One of the main
problems concerning g-measures is whether a given g-function admits a unique
compatible g-measure. Existence of compatible g-measures requires only the
continuity of $g$, while stronger continuity conditions are needed to ensure
uniqueness. For instance, Hölder continuity of the g-function implies the
existence and uniqueness of a compatible g-measure for which strong mixing
holds. Several criteria have been established to ensure uniqueness, all of them
relying on the regularity of the g-function. As mentioned in Section 1, several
works have considered the d-continuity of g-measures under strong regularity
conditions for the limit g-function, and have proved in this way that the limit
g-measure has good ergodic properties (the Bernoullicity of the natural
extension [7] or the fast decay of correlations [?]). On the other hand, several
examples have been proposed to show that the continuity of the g-function is not
enough to ensure the uniqueness of the corresponding g-measure. Among those
examples we find the already classical Bramson-Kalikow construction [2].
Recently P. Hulse [?] published a construction, inspired by the Ising model with
long range interactions, of a g-function for which uniqueness fails. For this
example, the set of compatible g-measures necessarily contains non-ergodic
measures.

3.2. Let us start by recalling the notion of variation of a function and that of
Markov approximation to a measure.

For $\phi : X \to \mathbb R$ and each $\ell \in \mathbb N$, the
$\ell$-variation of $\phi$ is given by

\[ \mathrm{var}_\ell\,\phi := \max_{a\in A^\ell} \Big\{ \sup_{x\in[a]} \phi(x) - \inf_{x\in[a]} \phi(x) \Big\}. \tag{8} \]

For $\phi$ continuous we necessarily have
$\lim_{\ell\to\infty} \mathrm{var}_\ell\,\phi = 0$. In this case, the speed of
convergence of the variation characterizes the regularity of $\phi$. For
instance, Hölder continuity corresponds to exponential decrease of the
variation.
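
For a locally constant function, the variation of Equation (8) can be computed by enumerating words. The Python sketch below does this for a function depending only on finitely many leading symbols; the function `variation`, its argument `phi_range`, and the test function `phi` are assumptions introduced for this illustration.

```python
import itertools

def variation(phi, alphabet, ell, phi_range):
    """var_ell(phi) for a function phi of the first `phi_range` symbols of x."""
    if ell >= phi_range:
        return 0.0                       # phi is constant on cylinders of length ell
    worst = 0.0
    for prefix in itertools.product(alphabet, repeat=ell):
        values = [phi(prefix + tail)
                  for tail in itertools.product(alphabet, repeat=phi_range - ell)]
        worst = max(worst, max(values) - min(values))
    return worst

def phi(word):
    """Hypothetical test function with exponentially decaying dependence on coordinates."""
    return sum(symbol * 2.0 ** (-k) for k, symbol in enumerate(word, start=1))

variations = [variation(phi, (0, 1), ell, 5) for ell in range(1, 6)]
```

For this test function the variation halves, up to a constant offset, each time the cylinder length grows by one, which is the exponential decrease associated with Hölder continuity.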

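Given $\mu \in \mathcal M(X)$, for each $\ell \in \mathbb N$, the canonical
$\ell$-step Markov approximation to $\mu$ is the only measure
$\mu_\ell \in \mathcal M(X)$ satisfying

\[ \mu_\ell[a_1^n] = \mu[a_{n-\ell+1}^{n}] \prod_{j=1}^{n-\ell} \frac{\mu[a_j^{j+\ell}]}{\mu[a_{j+1}^{j+\ell}]} \tag{9} \]

for all $a \in X$ and $n > \ell$.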
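
A minimal Python sketch of the canonical Markov approximation follows, assuming the reading of Equation (9) in which each symbol is conditioned on the next few symbols, consistently with Equation (7). The helper names and the two-state example are hypothetical; the check uses the fact that a stationary one-step Markov measure should coincide with its own approximation.

```python
import itertools

def markov_measure(pi, P):
    """Cylinder weights of the stationary Markov measure with transition matrix P."""
    def mu(word):
        prob = pi[word[0]]
        for a, b in zip(word, word[1:]):
            prob *= P[a][b]
        return prob
    return mu

def markov_approximation(mu, ell):
    """Canonical ell-step approximation (ell >= 1) built from the cylinder weights of mu."""
    def mu_ell(word):
        n = len(word)
        if n <= ell:
            return mu(word)
        prob = mu(word[n - ell:])        # weight of the last ell symbols
        for j in range(n - ell):
            # ratio mu[a_j .. a_{j+ell}] / mu[a_{j+1} .. a_{j+ell}]
            prob *= mu(word[j:j + ell + 1]) / mu(word[j + 1:j + ell + 1])
        return prob
    return mu_ell

# Hypothetical example: pi is the stationary vector of P.
pi = [0.8, 0.2]
P = [[0.9, 0.1], [0.4, 0.6]]
mu = markov_measure(pi, P)
mu_1 = markov_approximation(mu, 1)
words = list(itertools.product((0, 1), repeat=5))
max_gap = max(abs(mu_1(w) - mu(w)) for w in words)
total = sum(mu_1(w) for w in words)
```

For measures with longer memory the approximation differs from the original, and the gap is what the projective distance between the two quantifies.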

It is well known and easily proved that $\mu_\ell \to \mu$ as
$\ell \to \infty$ in the vague topology. In this respect, concerning
g-measures, we have the following theorem.

Theorem 4. Let $g : X \to (0,1)$ be a continuous g-function and
$\mu \in \mathcal M(X)$ a compatible g-measure. For each $\ell \in \mathbb N$ let
$\mu_\ell \in \mathcal M(X)$ be the canonical $\ell$-step Markov approximation.
Then $\mu_\ell \to \mu$ as $\ell \to \infty$ in the projective distance.
Furthermore,

\[ \rho(\mu_\ell,\mu) \le \mathrm{var}_\ell \log\circ g. \]


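Proof. First note that for all $a \in X$ and $n < m$ we have

\[ \frac{\mu[a_1^n]}{\mu[a_2^n]} = \sum_{b\in A^m} p(b)\,\frac{\mu[b_1^m]}{\mu[b_2^m]}, \]

with $p : A^m \to [0,1]$ a probability distribution given by

\[ p(b) = \begin{cases} \mu[b_2^m]/\mu[a_2^n] & \text{if } b_1^n = a_1^n, \\ 0 & \text{otherwise.} \end{cases} \]

It follows from this, taking the limit $m \to \infty$, that

\[ \min_{x\in[a_1^\ell]} g(x) \le \frac{\mu[a_1^n]}{\mu[a_2^n]} \le \max_{x\in[a_1^\ell]} g(x), \tag{10} \]

for all $a \in X$ and $\ell \le n$.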


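For $n \le \ell$ we have $\mu_\ell[a_1^n] = \mu[a_1^n]$ for all $a \in X$. On the
other hand, for $n > \ell$ and $a \in X$, by writing

\begin{align*}
\mu_\ell[a_1^n] &= \mu[a_{n-\ell+1}^{n}] \prod_{j=1}^{n-\ell} \frac{\mu[a_j^{j+\ell}]}{\mu[a_{j+1}^{j+\ell}]}, \\
\mu[a_1^n] &= \mu[a_{n-\ell+1}^{n}] \prod_{j=1}^{n-\ell} \frac{\mu[a_j^{n}]}{\mu[a_{j+1}^{n}]},
\end{align*}

we readily obtain

\[ \frac{1}{n}\left|\log\frac{\mu_\ell[a_1^n]}{\mu[a_1^n]}\right| \le \frac{1}{n}\sum_{j=1}^{n-\ell} \left| \log\frac{\mu[a_j^{j+\ell}]}{\mu[a_{j+1}^{j+\ell}]} - \log\frac{\mu[a_j^{n}]}{\mu[a_{j+1}^{n}]} \right|. \]

Inequalities (10) imply

\[ \frac{1}{n}\left|\log\frac{\mu_\ell[a_1^n]}{\mu[a_1^n]}\right| \le \frac{1}{n}\sum_{j=1}^{n-\ell} \Big( \max_{x\in[a_j^{j+\ell}]} \log\circ g(x) - \min_{x\in[a_j^{j+\ell}]} \log\circ g(x) \Big) \le \mathrm{var}_\ell \log\circ g, \]

for all $a \in X$ and $n \in \mathbb N$, from which it follows that
$\rho(\mu_\ell,\mu) \le \mathrm{var}_\ell \log\circ g$, and the proof is done.
□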


3.3. Let us describe the construction by P. Hulse cited above, which we slightly
modify to fit our context. Consider the real map
$t \mapsto \psi(t) = e^{t}(e^{t}+e^{-t})^{-1}$ and fix sequences
$\{h_\ell \in \mathbb R^+\}_{\ell=0}^{\infty}$,
$\{h'_\ell \in \mathbb R^+\}_{\ell=0}^{\infty}$,
$\{J_\ell \in \mathbb R^+\}_{\ell=1}^{\infty}$, and
$\{\Lambda_\ell \in \mathbb N\}_{\ell=0}^{\infty}$. Let
$\pi : A \to \{-1,0,1\}$ be such that
$\#\pi^{-1}(\{1\}) = \#\pi^{-1}(\{-1\}) = \lfloor \#A/2 \rfloor$. With this
define the locally constant functions
$\{g_\ell, g'_\ell : X \to [0,1]\}_{\ell\in\mathbb N}$ given by

( 11 ) 


( 12 ) 


9i{x) = -0 Jk ( 7r ( x ))A k + h^j j . 


g[{x) = t/> /3t r(xO V J k (n{x)) Ak 



where $(\pi(x))_\Lambda = \Lambda^{-1}\sum_{k=1}^{\Lambda} \pi(x_k)$ for each
$\Lambda \in \mathbb N$. Now, for each $\ell \in \mathbb N$, both $g_\ell$ and
$g'_\ell$ are constant inside each cylinder of length $\Lambda_\ell$, therefore
Walters' criterion (logarithm with summable variations [30]) ensures the
existence and uniqueness of g-measures $\mu_\ell$ and $\mu'_\ell$ compatible
with $g_\ell$ and $g'_\ell$ respectively. Hulse's construction consists in
determining sequences $\{h_\ell \in \mathbb R^+\}_{\ell=0}^{\infty}$,
$\{h'_\ell \in \mathbb R^+\}_{\ell=0}^{\infty}$,
$\{J_\ell \in \mathbb R^+\}_{\ell=1}^{\infty}$ and
$\{\Lambda_\ell \in \mathbb N\}_{\ell=0}^{\infty}$ ensuring that
$\{g_\ell\}_{\ell\in\mathbb N}$ and $\{g'_\ell\}_{\ell\in\mathbb N}$ have a
common continuous limit $g : X \to [0,1]$, while
$\{\mu_\ell\}_{\ell\in\mathbb N}$ and $\{\mu'_\ell\}_{\ell\in\mathbb N}$ do not
converge to the same measure. In this way he obtains a simplex (made of all the
convex combinations of the two different limiting measures) of compatible
g-measures.

From Hulse's construction and Theorem 4 the next result readily follows.

Theorem 5. There exists a sequence
$\{\mu_\ell \in \mathcal M^+(X)\}_{\ell\in\mathbb N}$ converging in the
projective distance, but not in the d-distance.

Proof. Let $g : X \to [0,1]$ be the g-function in Hulse's construction above,
and let $\mathcal M(g)$ be the collection of all the compatible g-measures.
Since $\mathcal M(g)$ is not a singleton, it necessarily contains non-ergodic
measures, for instance any strict convex combination of two different extremal
measures. Let $\mu$ be such a non-ergodic measure. Now, for each
$\ell \in \mathbb N$, let $\mu_\ell$ be the $\ell$-step Markov approximation to
$\mu$, as defined in Equation (9). According to Theorem 4 the sequence
$\{\mu_\ell\}_{\ell\in\mathbb N}$ converges to $\mu$ in the projective distance.
It is known that d-limits of mixing measures are mixing (see Theorem 1.9.17 in
[29] for instance). Since $\mu$ is fully supported, $\mu_\ell$ is a mixing
measure for each $\ell \in \mathbb N$. A d-limit of
$\{\mu_\ell\}_{\ell\in\mathbb N}$ would be mixing and, the d-topology being finer
than the vague topology, it would coincide with $\mu$. Since $\mu$ is not even
ergodic, $\{\mu_\ell\}_{\ell\in\mathbb N}$ cannot converge in d-distance. □

3.4. It is known that the entropy is a d-continuous functional in the class of
ergodic processes (Theorem 1.9.16 in [29]), while it is only upper
semicontinuous with respect to the vague topology (Theorem 1.9.1 in [29]).
Concerning the projective distance, we have the following result.

Theorem 6. Assume $g$ admits a unique g-measure $\mu$ (in which case this
measure is ergodic), and suppose that $\{\mu_p\}_{p\in\mathbb N}$ is a sequence
of ergodic measures converging to $\mu$ in the projective distance. Then

\[ \lim_{p\to\infty} h(\mu_p) = h(\mu) = -\int_X \log\circ g \, d\mu. \]


Proof. First we prove that the relative entropy

\[ h(\mu_p|\mu) := \lim_{n\to\infty} \frac{1}{n}\sum_{a\in A^n} \mu_p[a]\log\frac{\mu_p[a]}{\mu[a]}, \]

which can easily be proved to be non-negative, converges to zero as
$p \to \infty$. Indeed, since

\[ e^{-n\rho(\mu_p,\mu)} \le \frac{\mu_p[a]}{\mu[a]} \le e^{n\rho(\mu_p,\mu)} \]

for each $n \in \mathbb N$ and $a \in A^n$, then

\begin{align*}
0 \le h(\mu_p|\mu) &= \lim_{n\to\infty} \frac{1}{n}\sum_{a\in A^n} \mu_p[a]\log\frac{\mu_p[a]}{\mu[a]} \\
&\le \lim_{n\to\infty} \frac{1}{n}\sum_{a\in A^n} \mu_p[a]\, n\,\rho(\mu_p,\mu) = \rho(\mu_p,\mu),
\end{align*}

and the claim follows. Now, following the arguments in [4, Section 3.2], we
readily deduce that

\[ h(\mu_p|\mu) = -h(\mu_p) - \int_X \log\circ g \, d\mu_p. \]

Now, since the topology of the projective distance is finer than the vague
topology, we necessarily have

\[ \lim_{p\to\infty} \int_X \log\circ g \, d\mu_p = \int_X \log\circ g \, d\mu. \]

Finally, the Variational Principle for g-measures (see [16] for a proof)
establishes that

\[ h(\mu) = -\int_X \log\circ g \, d\mu. \]

From all the above arguments it follows that

\begin{align*}
\lim_{p\to\infty} \big( h(\mu) - h(\mu_p) \big) &= \lim_{p\to\infty} \Big( -\int_X \log\circ g \, d\mu - h(\mu_p) \Big) \\
&= \lim_{p\to\infty} \Big( -\int_X \log\circ g \, d\mu_p - h(\mu_p) \Big) \\
&= \lim_{p\to\infty} h(\mu_p|\mu) = 0,
\end{align*}

and the proof is done.

□
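
The first step of the proof, the bound of the relative entropy by the projective distance, can be illustrated with product measures, for which both quantities are explicit. In the Python sketch below the relative entropy rate reduces to the Kullback-Leibler divergence of the one-symbol marginals, and the projective distance to the largest one-symbol log-ratio; the function names and the numerical marginals are assumptions made for this example.

```python
import math

def kl_rate(p, q):
    """Relative entropy rate h(mu|nu) of two product measures, equal to KL(p || q)."""
    return sum(p[s] * math.log(p[s] / q[s]) for s in p)

def projective_distance_iid(p, q):
    """rho(mu, nu) for product measures: the supremum over n is attained at n = 1."""
    return max(abs(math.log(p[s] / q[s])) for s in p)

# Hypothetical marginals on the alphabet {0, 1}.
p = {0: 0.3, 1: 0.7}
q = {0: 0.5, 1: 0.5}
relative_entropy = kl_rate(p, q)
rho = projective_distance_iid(p, q)
```

In this example the relative entropy is several times smaller than the projective distance, so the inequality is far from sharp, but it is all the argument requires.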


3.5. In this paragraph we explore the relationship between convergence of
g-functions and the possible convergence, in projective distance, of the
corresponding g-measures. An analogous result, concerning the d-distance, was
obtained by Coelho and Quas in [7]. Before stating our result, let us fix some
notation.

Let $\mathcal G \subset C^0(X)$ denote the set of g-functions, i.e. the set of
continuous functions $g : X \to (0,1)$ satisfying $\sum_{a\in A} g(ax) = 1$ for
all $x \in X$. Now, for $g \in \mathcal G$ denote by
$\mathcal M(g) \subset \mathcal M(X)$ the simplex made of all probability
measures compatible with $g$ (or g-measures) as defined in Equation (7).

For $\phi : X \to \mathbb R$ and $N \in \mathbb N$, let us denote
$\mathrm{svar}_N\,\phi = \sum_{k=1}^{N} \mathrm{var}_k\,\phi$, where
$\mathrm{var}_k\,\phi$ is defined as in Equation (8). We will say that a locally
constant function $\phi : X \to \mathbb R$ has range $\ell$ whenever

\[ x_1^\ell = y_1^\ell \;\Rightarrow\; \phi(x) = \phi(y). \]

Clearly, for a locally constant function of range $\ell$,
$\mathrm{var}_n\,\phi = 0$ for all $n \ge \ell$. It is not hard to prove that if
$g \in \mathcal G$ is locally constant of range $\ell+1$, then $\mathcal M(g)$
contains a unique $\ell$-step Markov measure (see the Appendix for details). We
have the following.

Theorem 7. Let $\{g_\ell \in \mathcal G\}_{\ell\in\mathbb N}$ be a sequence of
locally constant functions converging to $g$ in the sup-norm, and such that for
each $\ell \in \mathbb N$ the function $g_\ell$ is locally constant of range
$\ell+1$. If

\[ \lim_{\ell\to\infty} \|\log(g/g_\ell)\|\, e^{\mathrm{svar}_\ell \log\circ g_\ell} = 0, \]

then the sequence $\{\mu_\ell\}_{\ell\in\mathbb N}$, where $\mu_\ell$ is the
unique measure in $\mathcal M(g_\ell)$, converges in projective distance.
Furthermore, the limit measure $\mu \in \mathcal M(X)$ is the unique measure in
$\mathcal M(g)$.

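Proof. First note that, for every $m > \ell$,
$\mathrm{var}_m \log\circ g_\ell = 0$ and both $\mu_\ell$ and $\mu_m$ are
$m$-step Markov measures. From Proposition 2 in the Appendix, it follows that

\begin{align*}
\rho(\mu_\ell,\mu_m) &\le 2\,\|\log(g_m/g_\ell)\|\, e^{\min(\mathrm{svar}_m \log\circ g_\ell,\ \mathrm{svar}_m \log\circ g_m)} \\
&\le 2\,\big(\|\log(g/g_\ell)\| + \|\log(g/g_m)\|\big)\, e^{\min(\mathrm{svar}_m \log\circ g_\ell,\ \mathrm{svar}_m \log\circ g_m)} \\
&\le 2\,\|\log(g/g_\ell)\|\, e^{\mathrm{svar}_\ell \log\circ g_\ell} + 2\,\|\log(g/g_m)\|\, e^{\mathrm{svar}_m \log\circ g_m},
\end{align*}

for all $m > \ell$. The hypothesis of the theorem implies that
$\{\mu_\ell\}_{\ell\in\mathbb N}$ is a Cauchy sequence in projective distance,
and by Theorem 1 it must converge in projective distance to a certain measure
$\mu \in \mathcal M^+(X)$.

Now, since $g = \lim_{\ell\to\infty} g_\ell$ in the sup-norm, then necessarily
$g \in \mathcal G$. Let $\nu \in \mathcal M(g)$ and for each
$\ell \in \mathbb N$ let $\nu_\ell$ be its canonical $\ell$-step Markov
approximation. Let $h_\ell$ be the locally constant g-function associated to
$\nu_\ell$, i.e. $h_\ell(x) = \nu[x_1^{\ell+1}]/\nu[x_2^{\ell+1}]$ for all
$x \in X$. According to Inequalities (10) we have

\[ \min_{y\in[x_1^\ell]} \log\circ g(y) \le \log\circ h_\ell(x) \le \max_{y\in[x_1^\ell]} \log\circ g(y), \]

and from this $\|\log(g/h_\ell)\| \le \mathrm{var}_\ell \log\circ g$. Then,
using again Proposition 2, we have

\begin{align*}
\rho(\mu_\ell,\nu_\ell) &\le 2\,\|\log(g_\ell/h_\ell)\|\, e^{\mathrm{svar}_\ell \log\circ g_\ell} \\
&\le 2\,\big(\|\log(g/h_\ell)\| + \|\log(g_\ell/g)\|\big)\, e^{\mathrm{svar}_\ell \log\circ g_\ell} \\
&\le 2\,\big(\mathrm{var}_\ell \log\circ g + \|\log(g_\ell/g)\|\big)\, e^{\mathrm{svar}_\ell \log\circ g_\ell}.
\end{align*}

Now, since $\mathrm{var}_\ell \log\circ g_\ell = 0$ and

\[ \mathrm{var}_\ell \log\circ g \le \mathrm{var}_\ell \log\circ g_\ell + \|\log(g_\ell/g)\| = \|\log(g_\ell/g)\|, \]

it follows that

\[ \rho(\mu_\ell,\nu_\ell) \le 4\,\|\log(g_\ell/g)\|\, e^{\mathrm{svar}_\ell \log\circ g_\ell}, \]

which ensures that $\{\nu_\ell\}_{\ell\in\mathbb N}$ converges to $\mu$, but
according to Theorem 4 it converges to $\nu$ as well, therefore $\mu = \nu$ and
the proof is finished. □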

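Example 1. Consider the sequence of g-functions
$\{g_\ell : \{-1,1\}^{\mathbb N} \to (0,1)\}_{\ell\in\mathbb N}$ given by

\[ g_\ell(x) = \frac{\exp\big(\beta x_1 \sum_{k=2}^{\ell} x_k k^{-2}\big)}{\exp\big(+\beta \sum_{k=2}^{\ell} x_k k^{-2}\big) + \exp\big(-\beta \sum_{k=2}^{\ell} x_k k^{-2}\big)}. \]

Clearly $\{g_\ell\}_{\ell\in\mathbb N}$ converges uniformly to the function
$g : \{-1,1\}^{\mathbb N} \to (0,1)$ given by

\[ g(x) = \frac{\exp\big(\beta x_1 \sum_{k=2}^{\infty} x_k k^{-2}\big)}{\exp\big(+\beta \sum_{k=2}^{\infty} x_k k^{-2}\big) + \exp\big(-\beta \sum_{k=2}^{\infty} x_k k^{-2}\big)}. \]

Furthermore, a simple computation leads to the inequalities

\begin{align*}
\|\log(g_\ell/g)\| &\le 2\beta \sum_{k=\ell+1}^{\infty} k^{-2} \le 2\beta\,\ell^{-1}, \\
\exp(\mathrm{svar}_\ell \log\circ g_\ell) &\le \exp\Big(4\beta \sum_{k=2}^{\ell} (k-1)\,k^{-2}\Big) \le \exp(4\beta\log \ell).
\end{align*}

According to Theorem 7 the sequence $\{\mu_\ell \in \mathcal M(g_\ell)\}$
converges in the projective distance to the unique g-measure
$\mu \in \mathcal M(g)$, provided $\ell^{4\beta-1} \to 0$ when
$\ell \to \infty$, i.e., provided $\beta < 1/4$.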
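
The two elementary estimates used in Example 1, and the way they combine into the threshold on the parameter, can be checked numerically. The Python sketch below is only an illustration: the function names are hypothetical, and `criterion_bound` evaluates the product of the two upper bounds stated in the example rather than the exact quantities.

```python
import math

def svar_sum(ell):
    """sum_{k=2}^{ell} (k - 1) / k**2, the sum appearing in the exponent of the second bound."""
    return sum((k - 1) / k ** 2 for k in range(2, ell + 1))

def tail_sum(ell, terms=100000):
    """Partial sum of sum_{k > ell} k**(-2), a lower estimate of the full tail."""
    return sum(1.0 / k ** 2 for k in range(ell + 1, ell + 1 + terms))

def criterion_bound(beta, ell):
    """Product of the two bounds: (2*beta/ell) * ell**(4*beta) = 2*beta*ell**(4*beta - 1)."""
    return (2.0 * beta / ell) * ell ** (4.0 * beta)

subcritical = [criterion_bound(0.2, ell) for ell in (10, 10**3, 10**6)]
supercritical = [criterion_bound(0.3, ell) for ell in (10, 10**3, 10**6)]
```

Below the threshold 1/4 the bound decays like a negative power of the range, while above it the bound grows, so the criterion of Theorem 7 gives no information there.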


4. Concluding Remarks. 

With Theorems 3 and 5 we have established the incomparability of the d-topology
and the projective topology in the set of fully-supported probability measures.
It is nevertheless not clear whether this incomparability persists in the
restriction to the class of invariant probability measures. It is not hard to
verify that the projective distance between two Markov measures can be computed
by means of a finite algorithm taking the parameters defining the measures as
inputs. One can also argue that the output value varies continuously, or at
worst piecewise continuously, with the input parameters. This does not seem to
be the case for the d-distance, which suggests that in the class of Markov
measures the projective topology is coarser than the d-topology.

Theorem 7 establishes a new criterion for uniqueness of g-measures based on the
speed of convergence of locally constant approximations to the g-function. It
can be related to a similar criterion ensuring convergence in d-distance
established by Coelho and Quas in [7]. Although in our case we cannot deduce
that the limit measure satisfies the Bernoulli property, we can nevertheless
ensure that the limit measure inherits the mixing property of the Markov
approximations and, thanks to Theorem 6, that the entropy is continuous with
respect to the projective distance at the limit measure.

Example 1 is the g-measure analog of the one-dimensional Ising model with long
range interaction, for which a phase transition has been proved to occur (see
[?] for details). The analogy suggests that the uniqueness of the associated
g-measure must break down at high values of the parameter $\beta$. This
transition should be detectable through a criterion involving the regularity of
the g-function and the speed of convergence of the Markov approximations.

The projective distance appears to be suited to the study of measures obtained
by random substitutions, such as the one we have characterized in [27]. We can
prove that for a certain class of random substitutions, the substitution process
is a contraction in the projective distance, and that the unique attractor has
the mixing property. The study of this kind of process and its characterization
in terms of the projective distance is the subject of a forthcoming work.

Appendix A. 

A.1. An $n \times n$ real matrix $M$ is said to be primitive if $M \ge 0$ (i.e.
none of its entries is negative) and for some $k \in \mathbb N$, $M^k > 0$ (i.e.
all the entries of $M^k$ are positive). The primitivity index of a primitive
matrix $M$ is the smallest integer $\ell$ such that $M^\ell > 0$. The
Perron-Frobenius Theorem ensures that the spectral radius (i.e. the maximal norm
of its eigenvalues) of a primitive matrix $M$ is achieved by a simple positive
eigenvalue $\lambda$ with positive right and left eigenvectors $v$ and $w$
respectively.

The function $d_p : (0,\infty)^n \times (0,\infty)^n \to [0,\infty)$ such that

\[ d_p(x,y) = \max_{1\le i\le n} \log\frac{x_i}{y_i} - \min_{1\le i\le n} \log\frac{x_i}{y_i} \tag{13} \]

defines a projective pseudo-distance which becomes a distance when restricted to
the simplex of probability vectors. A refined version of the Perron-Frobenius
Theorem, which can be found in [28], establishes that the action of an
$n \times n$ primitive matrix $M$ with primitivity index $\ell$ on the cone
$(0,\infty)^n$ defines a contraction with respect to the projective
pseudo-distance $d_p$. More precisely, for all $x, y \in (0,\infty)^n$ we have

\[ d_p(Mx,My) \le d_p(x,y) \quad\text{and}\quad d_p(M^\ell x, M^\ell y) \le \tau_M\, d_p(x,y), \tag{14} \]


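where

\[ \tau_M = \frac{1-\sqrt{\Phi_M}}{1+\sqrt{\Phi_M}}, \qquad \Phi_M = \min_{i,j,k,l} \frac{M^\ell(i,k)\,M^\ell(j,l)}{M^\ell(j,k)\,M^\ell(i,l)}. \tag{15} \]

The coefficient $\tau_M$ is the so-called Birkhoff contraction coefficient.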
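
The contraction property can be observed numerically. The following Python sketch implements the pseudo-distance of Equation (13) and, as an assumption about the form of the coefficient, the standard Birkhoff expression built from the smallest cross-ratio of the entries of a positive matrix. The matrix and vectors are arbitrary examples, and all function names are hypothetical.

```python
import math

def hilbert_distance(x, y):
    """Projective pseudo-distance: max_i log(x_i/y_i) - min_i log(x_i/y_i)."""
    logs = [math.log(a / b) for a, b in zip(x, y)]
    return max(logs) - min(logs)

def apply_matrix(M, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def birkhoff_coefficient(M):
    """(1 - sqrt(phi)) / (1 + sqrt(phi)), phi the smallest cross-ratio of the entries of M."""
    n = len(M)
    phi = min(M[i][k] * M[j][l] / (M[j][k] * M[i][l])
              for i in range(n) for j in range(n)
              for k in range(n) for l in range(n))
    return (1.0 - math.sqrt(phi)) / (1.0 + math.sqrt(phi))

# Arbitrary positive matrix (primitivity index 1) and two positive vectors.
M = [[0.5, 0.2, 0.3], [0.3, 0.5, 0.3], [0.2, 0.3, 0.4]]
x = [0.7, 0.2, 0.1]
y = [0.1, 0.3, 0.6]
before = hilbert_distance(x, y)
after = hilbert_distance(apply_matrix(M, x), apply_matrix(M, y))
tau = birkhoff_coefficient(M)
```

Since the pseudo-distance ignores positive scalar factors, it is the directions of the vectors, not their norms, that are brought together by the matrix.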


Proposition 1. Let $P, Q : \{1,2,\ldots,n\} \times \{1,2,\ldots,n\} \to (0,1)$ be
stochastic by columns, i.e.,
$\sum_{i=1}^{n} P(i,j) = \sum_{i=1}^{n} Q(i,j) = 1$ for each
$j \in \{1,2,\ldots,n\}$. Suppose that

\[ e^{-\epsilon} \le P(i,j)/Q(i,j) \le e^{\epsilon} \]

for some $\epsilon > 0$ and each $i, j \in \{1,2,\ldots,n\}$. Then the maximal
eigenvalue of both matrices is 1, and the associated positive right
eigenvectors $u, v$ are such that

\[ d_p(u,v) \le \frac{2\epsilon}{1-\min(\tau_P,\tau_Q)}, \]

where $\tau_P$ and $\tau_Q$ are the Birkhoff coefficients of $P$ and $Q$
respectively.


Proof. First note that an $n \times n$ positive matrix $M$, stochastic by
columns, preserves the simplex of probability vectors
$\Delta = \{u \in [0,1]^n : \sum_{i=1}^{n} u(i) = 1\}$. Therefore, according to
Inequality (14) and Banach's fixed point theorem, the transformation
$u \mapsto Mu$ has a unique fixed point $v \in \Delta$, which necessarily
coincides with a positive eigenvector of $M$ associated to the eigenvalue 1.
Furthermore, because of the contractiveness of $M$ with respect to $d_p$, we
have $v = \lim_{n\to\infty} M^n u$ for all $u \in \Delta$. Hence there cannot be
another positive eigenvector, which implies that 1 necessarily is the maximal
eigenvalue of $M$. In this way we prove in particular that 1 is the maximal
eigenvalue of both $P$ and $Q$, with unique eigenvectors $u, v \in \Delta$
respectively.

Let us assume now that $\tau_Q \le \tau_P$. Then

\begin{align*}
d_p(u,v) &\le \lim_{N\to\infty} \Big( \sum_{n=0}^{N} d_p(Q^n u, Q^{n+1}u) + d_p(Q^{N+1}u, v) \Big) \\
&\le d_p(u,Qu) \sum_{n=0}^{\infty} \tau_Q^{\,n} = \frac{d_p(u,Qu)}{1-\tau_Q} = \frac{d_p(Pu,Qu)}{1-\tau_Q}.
\end{align*}

Finally, since $e^{-\epsilon} \le P(i,j)/Q(i,j) \le e^{\epsilon}$ for all
$i, j \in \{1,2,\ldots,n\}$, then

\[ e^{-\epsilon} \le \frac{\sum_{j=1}^{n} P(i,j)\,u(j)}{\sum_{j=1}^{n} Q(i,j)\,u(j)} \le e^{\epsilon} \]

for all $1 \le i \le n$, and from this

\[ d_p(Pu,Qu) = \max_{1\le i\le n} \log\frac{(Pu)(i)}{(Qu)(i)} - \min_{1\le i\le n} \log\frac{(Pu)(i)}{(Qu)(i)} \le 2\epsilon. \]

□
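
Proposition 1 can be tried out on a small example. The Python sketch below finds the fixed probability vectors of two column-stochastic positive matrices by power iteration and compares their projective distance with the bound of the proposition. The matrices, the iteration count, and the helper names are assumptions made for this illustration, and the Birkhoff coefficient is computed with the standard cross-ratio expression.

```python
import math

def hilbert_distance(x, y):
    logs = [math.log(a / b) for a, b in zip(x, y)]
    return max(logs) - min(logs)

def apply_matrix(M, x):
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def birkhoff_coefficient(M):
    n = len(M)
    phi = min(M[i][k] * M[j][l] / (M[j][k] * M[i][l])
              for i in range(n) for j in range(n)
              for k in range(n) for l in range(n))
    return (1.0 - math.sqrt(phi)) / (1.0 + math.sqrt(phi))

def fixed_probability_vector(M, iterations=500):
    """Power iteration on the simplex; converges for positive column-stochastic M."""
    n = len(M)
    u = [1.0 / n] * n
    for _ in range(iterations):
        u = apply_matrix(M, u)
        total = sum(u)
        u = [value / total for value in u]
    return u

# Hypothetical column-stochastic matrices with nearby entries.
P = [[0.5, 0.2], [0.5, 0.8]]
Q = [[0.45, 0.25], [0.55, 0.75]]
eps = max(abs(math.log(P[i][j] / Q[i][j])) for i in range(2) for j in range(2))
u = fixed_probability_vector(P)
v = fixed_probability_vector(Q)
lhs = hilbert_distance(u, v)
rhs = 2.0 * eps / (1.0 - min(birkhoff_coefficient(P), birkhoff_coefficient(Q)))
```

In this example the actual distance between the eigenvectors is well below the bound, which is expected since the estimate only uses the worst entrywise ratio.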

A.2. To an $\ell$-step Markov measure $\mu \in \mathcal M^+(X)$ there
corresponds a locally constant g-function $g_\mu : X \to (0,1)$ given by

\[ g_\mu(x) = \frac{\mu[x_1^{\ell+1}]}{\mu[x_2^{\ell+1}]}, \]

and such that $\mu$ is the unique $g_\mu$-measure, i.e.
$\mathcal M(g_\mu) = \{\mu\}$. The function $g_\mu$ defines a primitive matrix
$M_\mu : A^\ell \times A^\ell \to [0,1]$ as follows:

\[ M_\mu(a_1^\ell, b_1^\ell) = \begin{cases} g_\mu(a_1^\ell b_\ell) & \text{if } a_2^\ell = b_1^{\ell-1}, \\ 0 & \text{otherwise.} \end{cases} \tag{16} \]

It is easily verified that $M_\mu^\ell > 0$ and that 1 is the maximal eigenvalue
of $M_\mu$, with right eigenvector $v_\mu : A^\ell \to (0,1)$ such that
$v_\mu(a) = \mu[a]$. From Proposition 1 we derive the following.

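Proposition 2. Let $\mu, \nu \in \mathcal M^+(X)$ be two $\ell$-step Markov
measures, and let $g_\mu, g_\nu \in \mathcal G$ be the locally constant
g-functions associated to $\mu$ and $\nu$ respectively. Then

\[ \rho(\mu,\nu) \le 2\,\|\log(g_\mu/g_\nu)\|\, e^{\min(\mathrm{svar}_\ell \log\circ g_\mu,\ \mathrm{svar}_\ell \log\circ g_\nu)}. \]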

Proof. Let $v_\mu$ be such that $v_\mu(a) = \mu[a]$ for all $a \in A^\ell$, and
similarly for $v_\nu$. Then, Proposition 1 directly implies that

\[ d_p(v_\mu,v_\nu) \le \frac{2\,\|\log(g_\mu/g_\nu)\|}{1-\min(\tau_\mu,\tau_\nu)}. \]

It can be easily verified that
$\tau_\mu \le 1 - \exp(-\mathrm{svar}_\ell \log\circ g_\mu)$, and similarly for
$\tau_\nu$. From this it follows that

\[ d_p(v_\mu,v_\nu) \le 2\,\|\log(g_\mu/g_\nu)\|\, e^{\min(\mathrm{svar}_\ell \log\circ g_\mu,\ \mathrm{svar}_\ell \log\circ g_\nu)}. \]

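Let us recall that
$\rho(\mu,\nu) = \sup_{N\in\mathbb N} \max_{a\in A^N} |\log(\mu[a]/\nu[a])|/N$.
If the supremum is not reached at $N \le \ell$, then

\begin{align*}
\rho(\mu,\nu) &= \sup_{N\in\mathbb N} \max_{a\in A^N} \frac{1}{N}\left| \sum_{n=1}^{N-\ell} \log\frac{g_\mu(a_n^{n+\ell})}{g_\nu(a_n^{n+\ell})} + \log\frac{\mu[a_{N-\ell+1}^{N}]}{\nu[a_{N-\ell+1}^{N}]} \right| \\
&\le \sup_{N\in\mathbb N} \max_{a\in A^N} \frac{1}{N}\left( \sum_{n=1}^{N-\ell} \left|\log\frac{g_\mu(a_n^{n+\ell})}{g_\nu(a_n^{n+\ell})}\right| + \left|\log\frac{v_\mu(a_{N-\ell+1}^{N})}{v_\nu(a_{N-\ell+1}^{N})}\right| \right) \\
&\le \max\big( \|\log(g_\mu/g_\nu)\|,\ \|\log(v_\mu/v_\nu)\| \big).
\end{align*}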
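On the other hand, if the supremum is achieved at some $N \le \ell$, then

\begin{align*}
\rho(\mu,\nu) &= \max_{a\in A^N} \frac{1}{N}\left| \log\frac{\sum_{b\in A^{\ell-N}} v_\mu(ab)}{\sum_{c\in A^{\ell-N}} v_\nu(ac)} \right| \\
&\le \max_{a\in A^N} \left| \log \sum_{b\in A^{\ell-N}} \frac{v_\mu(ab)}{v_\nu(ab)}\,\frac{v_\nu(ab)}{\sum_{c\in A^{\ell-N}} v_\nu(ac)} \right| \\
&\le \max_{a\in A^N} \max_{b\in A^{\ell-N}} \left| \log\frac{v_\mu(ab)}{v_\nu(ab)} \right| = \|\log(v_\mu/v_\nu)\|.
\end{align*}

Finally, since both $v_\mu$ and $v_\nu$ are probability vectors, we have

\[ \|\log(v_\mu/v_\nu)\| \le \max_{a\in A^\ell} \log\frac{v_\mu(a)}{v_\nu(a)} - \min_{a\in A^\ell} \log\frac{v_\mu(a)}{v_\nu(a)} = d_p(v_\mu,v_\nu), \]

and with this

\begin{align*}
\rho(\mu,\nu) &\le \max\big( \|\log(g_\mu/g_\nu)\|,\ d_p(v_\mu,v_\nu) \big) \\
&\le 2\,\|\log(g_\mu/g_\nu)\|\, e^{\min(\mathrm{svar}_\ell \log\circ g_\mu,\ \mathrm{svar}_\ell \log\circ g_\nu)}.
\end{align*}

□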
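
The matrix of Equation (16) is easy to write down in the one-step case, where the compatibility condition on the indices is empty and the entry indexed by two symbols is the value of the g-function on the corresponding two-letter word. The Python sketch below builds it for a stationary two-state Markov measure and checks that its columns sum to one and that the one-symbol marginal is a right eigenvector for the eigenvalue 1. The numerical example and the function name are assumptions for illustration.

```python
def transfer_matrix(pi, P):
    """M(a, b) = mu[ab] / mu[b] for the one-step Markov measure mu[ab] = pi[a] * P[a][b]."""
    n = len(pi)
    return [[pi[a] * P[a][b] / pi[b] for b in range(n)] for a in range(n)]

# Hypothetical example: pi is the stationary vector of the transition matrix P.
pi = [0.8, 0.2]
P = [[0.9, 0.1], [0.4, 0.6]]
M = transfer_matrix(pi, P)
column_sums = [sum(M[a][b] for a in range(2)) for b in range(2)]
M_pi = [sum(M[a][b] * pi[b] for b in range(2)) for a in range(2)]
```

Column stochasticity comes from the stationarity of the measure, while the eigenvector identity is just the consistency of the two-letter marginals with the one-letter ones.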